
    Establishing and optimising unmanned airborne relay networks in urban environments

    This thesis assesses the use of a group of small, low-altitude, low-power (in terms of communication equipment) fixed-wing unmanned aerial vehicles (UAVs) as mobile communication relay nodes to facilitate reliable communication between ground nodes in urban environments. This work focuses on enhancing existing models for optimal trajectory planning and on enabling UAV relay implementation in realistic urban scenarios. The performance of the proposed UAV relay algorithms was demonstrated and validated in an indoor simulated urban environment, the first experiment of its kind. The objective of enabling UAV relay deployment in realistic urban environments is addressed by relaxing the assumptions of the communication prediction models, reducing knowledge requirements and improving prediction efficiency. This thesis explores assumptions about urban environment knowledge at three levels: (i) full knowledge of the urban environment, (ii) a partially known urban environment, and (iii) no knowledge of the urban environment. The work starts by exploring models that assume the city size, layout and its effects on wireless communication strength are known, representing full knowledge of the urban environment. [Continues.]

    Deep reinforcement learning with modulated Hebbian plus Q-network architecture

    In this article, we consider a subclass of partially observable Markov decision process (POMDP) problems which we term confounding POMDPs. In these types of POMDPs, temporal difference (TD)-based reinforcement learning (RL) algorithms struggle, as the TD error cannot be easily derived from observations. We solve these types of problems using a new bio-inspired neural architecture that combines a modulated Hebbian network (MOHN) with a deep Q-network (DQN), which we call the modulated Hebbian plus Q-network architecture (MOHQA). The key idea is to use a Hebbian network with rarely correlated bio-inspired neural traces to bridge temporal delays between actions and rewards when confounding observations and sparse rewards result in inaccurate TD errors. In MOHQA, the DQN learns low-level features and control, while the MOHN contributes to high-level decisions by associating rewards with past states and actions. Thus, the proposed architecture combines two modules with significantly different learning algorithms, a Hebbian associative network and a classical DQN pipeline, exploiting the advantages of both. Simulations on a set of POMDPs and on the Malmo environment show that the proposed algorithm improved on DQN's results and even outperformed control tests with advantage actor-critic (A2C), quantile regression DQN with long short-term memory (QRDQN + LSTM), Monte Carlo policy gradient (REINFORCE), and aggregated memory for reinforcement learning (AMRL) algorithms on the most difficult POMDPs with confounding stimuli and sparse rewards.
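    The core mechanism described above, a reward-modulated Hebbian trace that bridges the delay between actions and rewards, can be illustrated with a minimal sketch. This is not the paper's implementation: the layer sizes, decay and learning-rate constants, and the simple additive combination with the DQN's Q-values are all assumptions chosen for illustration.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical sizes for illustration only.
    n_obs, n_act = 8, 4
    W = rng.normal(0, 0.1, (n_obs, n_act))  # MOHN associative weights
    trace = np.zeros((n_obs, n_act))        # eligibility traces
    decay, lr = 0.9, 0.05                   # assumed constants

    def mohn_step(obs, action, reward):
        """One reward-modulated Hebbian update: the decaying trace records
        recent observation-action correlations, and the (possibly delayed)
        reward gates how much of that trace is written into the weights."""
        global W, trace
        post = np.zeros(n_act)
        post[action] = 1.0
        trace = decay * trace + np.outer(obs, post)  # accumulate correlations
        W += lr * reward * trace                     # reward-modulated update

    def combined_q(obs, dqn_q):
        """Sketch of the combination step: add the MOHN's associative bias
        to the DQN's Q-estimates before action selection."""
        return dqn_q + obs @ W
    ```

    Because the trace decays rather than resets, a reward arriving several steps after the responsible action still strengthens the corresponding observation-action association, which is exactly what a pure TD error fails to do under confounding observations.
    
    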